AITopics

2512.13642

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Switzerland > St. Gallen > St. Gallen (0.04)
North America > United States > New York (0.04)
(5 more...)

Genre: Research Report (1.00)

Industry:

Banking & Finance > Economy (1.00)
Government > Regional Government > North America Government > United States Government (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Neural Information Processing SystemsOct-10-2025, 12:56:30 GMT

RandNet-Parareal: a time-parallel PDE solver using Random Neural Networks

Parallel-in-time (PinT) techniques have been proposed to solve systems of time-dependent differential equations by parallelizing the temporal domain.

algorithm, equation, randnet-parareal, (13 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Alameda County > Livermore (0.04)
Europe > Switzerland > St. Gallen > St. Gallen (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.92)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Artificial IntelligenceMay-8-2025

Explainable AI for Correct Root Cause Analysis of Product Quality in Injection Moulding

Muaz, Muhammad, Sajid, Sameed, Schulze, Tobias, Liu, Chang, Klasen, Nils, Drescher, Benny

If a product deviates from its desired properties in the injection moulding process, its root cause analysis can be aided by models that relate the input machine settings with the output quality characteristics. The machine learning models tested in the quality prediction are mostly black boxes; therefore, no direct explanation of their prognosis is given, which restricts their applicability in the quality control. The previously attempted explainability methods are either restricted to tree-based algorithms only or do not emphasize on the fact that some explainability methods can lead to wrong root cause identification of a product's deviation from its desired properties. This study first shows that the interactions among the multiple input machine settings do exist in real experimental data collected as per a central composite design. Then, the model-agnostic explainable AI methods are compared for the first time to show that different explainability methods indeed lead to different feature impact analysis in injection moulding. Moreover, it is shown that the better feature attribution translates to the correct cause identification and actionable insights for the injection moulding process. Being model agnostic, explanations on both random forest and multilayer perceptron are performed for the cause analysis, as both models have the mean absolute percentage error of less than 0.05% on the experimental dataset.

artificial intelligence, interaction, machine learning, (18 more...)

doi: 10.1016/j.jmapro.2025.03.114

2505.01445

Country:

Asia > China > Hong Kong (0.14)
Europe > Switzerland > St. Gallen > St. Gallen (0.04)
Europe > Sweden > Vaestra Goetaland > Gothenburg (0.04)
(2 more...)

Genre: Research Report > Experimental Study (0.46)

Industry: Materials (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.55)

Sorokin, Lev, Biagiola, Matteo, Stocco, Andrea

Simulator Ensembles for Trustworthy Autonomous Driving Testing

arXiv.org Artificial IntelligenceMar-11-2025

Scenario-based testing with driving simulators is extensively used to identify failing conditions of automated driving assistance systems (ADAS) and reduce the amount of in-field road testing. However, existing studies have shown that repeated test execution in the same as well as in distinct simulators can yield different outcomes, which can be attributed to sources of flakiness or different implementations of the physics, among other factors. In this paper, we present MultiSim, a novel approach to multi-simulation ADAS testing based on a search-based testing approach that leverages an ensemble of simulators to identify failure-inducing, simulator-agnostic test scenarios. During the search, each scenario is evaluated jointly on multiple simulators. Scenarios that produce consistent results across simulators are prioritized for further exploration, while those that fail on only a subset of simulators are given less priority, as they may reflect simulator-specific issues rather than generalizable failures. Our case study, which involves testing a deep neural network-based ADAS on different pairs of three widely used simulators, demonstrates that MultiSim outperforms single-simulator testing by achieving on average a higher rate of simulator-agnostic failures by 51%. Compared to a state-of-the-art multi-simulator approach that combines the outcome of independent test generation campaigns obtained in different simulators, MultiSim identifies 54% more simulator-agnostic failing tests while showing a comparable validity rate. An enhancement of MultiSim that leverages surrogate models to predict simulator disagreements and bypass executions does not only increase the average number of valid failures but also improves efficiency in finding the first valid failure.

multisim, simulator, test input, (17 more...)

2503.08936

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
(3 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Transportation > Ground > Road (1.00)
Information Technology > Robotics & Automation (1.00)
Automobiles & Trucks (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
(2 more...)

Ballarin, Giovanni, Grigoryeva, Lyudmila, Ortega, Juan-Pablo

Memory Capacity of Nonlinear Recurrent Networks: Is it Informative?

arXiv.org Machine LearningFeb-7-2025

Memory capacity of nonlinear recurrent networks: Is it informative? Abstract The total memory capacity (MC) of linear recurrent neural networks (RNNs) has been proven to be equal to the rank of the corresponding Kalman controllability matrix, and it is almost surely maximal for connectivity and input weight matrices drawn from regular distributions. This fact questions the usefulness of this metric in distinguishing the performance of linear RNNs in the processing of stochastic signals. This note shows that the MC of random nonlinear RNNs yields arbitrary values within established upper and lower bounds depending just on the input process scale. This confirms that the existing definition of MC in linear and nonlinear cases has no practical value.

artificial intelligence, deep learning, machine learning, (17 more...)

2502.04832

Country:

Asia > Singapore (0.04)
Europe > United Kingdom (0.04)
Europe > Switzerland > St. Gallen > St. Gallen (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Zozoulenko, Nikita, Cass, Thomas, Gonon, Lukas

Random Feature Representation Boosting

arXiv.org Machine LearningJan-30-2025

We introduce Random Feature Representation Boosting (RFRBoost), a novel method for constructing deep residual random feature neural networks (RFNNs) using boosting theory. RFRBoost uses random features at each layer to learn the functional gradient of the network representation, enhancing performance while preserving the convex optimization benefits of RFNNs. In the case of MSE loss, we obtain closed-form solutions to greedy layer-wise boosting with random features. For general loss functions, we show that fitting random feature residual blocks reduces to solving a quadratically constrained least squares problem. We demonstrate, through numerical experiments on 91 tabular datasets for regression and classification, that RFRBoost significantly outperforms traditional RFNNs and end-to-end trained MLP ResNets, while offering substantial computational advantages and theoretical guarantees stemming from boosting theory.

artificial intelligence, data mining, machine learning, (19 more...)

2501.18283

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
Europe > Switzerland > St. Gallen > St. Gallen (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry: Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Data Science > Data Mining (0.93)

Gattiglio, Guglielmo, Grigoryeva, Lyudmila, Tamborrino, Massimiliano

RandNet-Parareal: a time-parallel PDE solver using Random Neural Networks

arXiv.org Machine LearningNov-9-2024

Parallel-in-time (PinT) techniques have been proposed to solve systems of time-dependent differential equations by parallelizing the temporal domain. Among them, Parareal computes the solution sequentially using an inaccurate (fast) solver, and then "corrects" it using an accurate (slow) integrator that runs in parallel across temporal subintervals. This work introduces RandNet-Parareal, a novel method to learn the discrepancy between the coarse and fine solutions using random neural networks (RandNets). RandNet-Parareal achieves speed gains up to x125 and x22 compared to the fine solver run serially and Parareal, respectively. Beyond theoretical guarantees of RandNets as universal approximators, these models are quick to train, allowing the PinT solution of partial differential equations on a spatial mesh of up to $10^5$ points with minimal overhead, dramatically increasing the scalability of existing PinT approaches. RandNet-Parareal's numerical performance is illustrated on systems of real-world significance, such as the viscous Burgers' equation, the Diffusion-Reaction equation, the two- and three-dimensional Brusselator, and the shallow water equation.

artificial intelligence, machine learning, modeling & simulation, (15 more...)

2411.06225

Country:

North America > United States > Texas (0.14)
Europe > United Kingdom (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.92)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Haboury, Nathan, Kordzanganeh, Mo, Melnikov, Alexey, Sekatski, Pavel

Information plane and compression-gnostic feedback in quantum machine learning

arXiv.org Artificial IntelligenceNov-4-2024

The information plane (Tishby et al. arXiv:physics/0004057, Shwartz-Ziv et al. arXiv:1703.00810) has been proposed as an analytical tool for studying the learning dynamics of neural networks. It provides quantitative insight on how the model approaches the learned state by approximating a minimal sufficient statistics. In this paper we extend this tool to the domain of quantum learning models. In a second step, we study how the insight on how much the model compresses the input data (provided by the information plane) can be used to improve a learning algorithm. Specifically, we consider two ways to do so: via a multiplicative regularization of the loss function, or with a compression-gnostic scheduler of the learning rate (for algorithms based on gradient descent). Both ways turn out to be equivalent in our implementation. Finally, we benchmark the proposed learning algorithms on several classification and regression tasks using variational quantum circuits. The results demonstrate an improvement in test accuracy and convergence speed for both synthetic and real-world datasets. Additionally, with one example we analyzed the impact of the proposed modifications on the performances of neural networks in a classification task.

artificial intelligence, information, machine learning, (17 more...)

2411.02313

Country:

North America > United States > California (0.05)
Europe > Switzerland > St. Gallen > St. Gallen (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Health & Medicine > Therapeutic Area (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.48)

Bermeitinger, Bernhard, Hrycej, Tomas, Pavone, Massimo, Kath, Julianus, Handschuh, Siegfried

Reducing the Transformer Architecture to a Minimum

arXiv.org Artificial IntelligenceOct-29-2024

Transformers are a widespread and successful model architecture, particularly in Natural Language Processing (NLP) and Computer Vision (CV). The essential innovation of this architecture is the Attention Mechanism, which solves the problem of extracting relevant context information from long sequences in NLP and realistic scenes in CV. A classical neural network component, a Multi-Layer Perceptron (MLP), complements the attention mechanism. Its necessity is frequently justified by its capability of modeling nonlinear relationships. However, the attention mechanism itself is nonlinear through its internal use of similarity measures. A possible hypothesis is that this nonlinearity is sufficient for modeling typical application problems. As the MLPs usually contain the most trainable parameters of the whole model, their omission would substantially reduce the parameter set size. Further components can also be reorganized to reduce the number of parameters. Under some conditions, query and key matrices can be collapsed into a single matrix of the same size. The same is true about value and projection matrices, which can also be omitted without eliminating the substance of the attention mechanism. Initially, the similarity measure was defined asymmetrically, with peculiar properties such as that a token is possibly dissimilar to itself. A possible symmetric definition requires only half of the parameters. We have laid the groundwork by testing widespread CV benchmarks: MNIST and CIFAR-10. The tests have shown that simplified transformer architectures (a) without MLP, (b) with collapsed matrices, and (c) symmetric similarity matrices exhibit similar performance as the original architecture, saving up to 90% of parameters without hurting the classification performance.

machine learning, natural language, variant, (19 more...)

doi: 10.5220/0012891000003838

2410.13732

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Austria > Vienna (0.14)
North America > United States (0.04)
(3 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.55)